- % -*- Dictionary: design; Package: C -*-
-
- May be worth having a byte-code representation for interpreted code. This way,
- an entire system could be compiled into byte-code for debugging (the
- "check-out" compiler?).
-
- Given our current inclination for using a stack machine to interpret IR1, it
- would be straightforward to layer a byte-code interpreter on top of this.
-
-
- Interpreter:
-
- Instead of having no interpreter, or a more-or-less conventional interpreter,
- or byte-code interpreter, how about directly executing IR1?
-
- We run through the IR1 passes, possibly skipping optional ones, until we get
- through environment analysis. Then we run a post-pass that annotates IR1 with
- information about where values are kept, i.e. the stack slot.
-
- We can lazily convert functions by having FUNCTION make an interpreted function
- object that holds the code (really a closure over the interpreter). The first
- time that we try to call the function, we do the conversion and processing.
- Also, we can easily keep track of which interpreted functions we have expanded
- macros in, so that macro redefinition automatically invalidates the old
- expansion, causing lazy reconversion.
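
The lazy conversion described above might be sketched as follows. This is only an illustration under stated assumptions: CONVERT-FORM is a hypothetical stand-in for the IR1 passes (here it just coerces the lambda expression to a function), and LAZY-FUNCTION, MAKE-INTERPRETED-FUNCTION, NOTE-MACRO-REDEFINED and *LAZY-FUNCTIONS* are invented names. A real implementation would record the expanded macros during conversion instead of taking them as an argument.

```lisp
;;; Hypothetical sketch; none of these names come from the actual compiler.
(defvar *conversion-count* 0
  "Counts how many times the stand-in conversion has run.")

(defvar *lazy-functions* '()
  "Every lazy function object created so far.")

(defun convert-form (lambda-form)
  "Stand-in for IR1 conversion: coerce the lambda expression to a function."
  (incf *conversion-count*)
  (coerce lambda-form 'function))

(defstruct (lazy-function (:constructor %make-lazy-function (source)))
  source              ; the original LAMBDA form
  (converted nil)     ; cached conversion, or NIL when stale
  (macros-used '()))  ; names of macros expanded in SOURCE (assumed known)

(defun make-interpreted-function (lambda-form macros-used)
  "Return a closure that converts LAMBDA-FORM the first time it is called."
  (let ((lf (%make-lazy-function lambda-form)))
    (setf (lazy-function-macros-used lf) macros-used)
    (push lf *lazy-functions*)
    (lambda (&rest args)
      (unless (lazy-function-converted lf)
        (setf (lazy-function-converted lf)
              (convert-form (lazy-function-source lf))))
      (apply (lazy-function-converted lf) args))))

(defun note-macro-redefined (name)
  "Discard cached conversions of every function that expanded macro NAME."
  (dolist (lf *lazy-functions*)
    (when (member name (lazy-function-macros-used lf))
      (setf (lazy-function-converted lf) nil))))
```

Calling the closure twice converts only once; after NOTE-MACRO-REDEFINED names one of its macros, the next call converts again.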
-
- Probably the interpreter will want to represent MVs by a recognizable structure
- that is always heap-allocated. This way, we can punt the stack issues involved
- in trying to spread MVs. So a continuation value can always be kept in a
- single cell.
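
One way to picture the heap-allocated MV representation is the sketch below. MV-BOX, CAPTURE-VALUES and SPREAD-VALUES are hypothetical names chosen for illustration. The sketch assumes that a single value is stored directly and that anything else (zero values, or two or more) is wrapped in a recognizable structure.

```lisp
;;; Hypothetical sketch of keeping any continuation value in one cell.
(defstruct (mv-box (:constructor make-mv-box (items)))
  (items '() :type list))   ; the values, in order

(defmacro capture-values (form)
  "Evaluate FORM and return one object, even if FORM yields several values."
  `(let ((vals (multiple-value-list ,form)))
     (if (= (length vals) 1)
         (first vals)
         (make-mv-box vals))))

(defun spread-values (cell)
  "Turn the contents of a cell back into real multiple values."
  (if (mv-box-p cell)
      (values-list (mv-box-items cell))
      cell))
```

With this, (capture-values (floor 7 2)) yields a single MV-BOX object, and SPREAD-VALUES recovers the two values only at the point where a receiver actually wants them.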
-
- The compiler can have some special frobs for making the interpreter efficient,
- such as a call operation that extracts arguments from the stack
- slots designated by a continuation list. Perhaps
- (values-mapcar fun . lists)
- <==>
- (values-list (mapcar fun . lists))
- This would be used with MV-CALL.
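
Taking the equivalence above literally, VALUES-MAPCAR can be written as an ordinary function. This is a minimal sketch of the semantics only; the note suggests the compiler would treat it specially, which this definition does not attempt.

```lisp
(defun values-mapcar (fun &rest lists)
  "Like MAPCAR, but return the results as multiple values, not as a list."
  (values-list (apply #'mapcar fun lists)))

;; Typical use with MULTIPLE-VALUE-CALL: each result becomes one argument.
(defun sum-of-increments (numbers)
  (multiple-value-call #'+ (values-mapcar #'1+ numbers)))
```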
-
-
- This scheme seems to provide nearly all of the advantages of both the compiler
- and conventional interpretation. The only significant disadvantage with
- respect to a conventional interpreter is that there is the one-time overhead of
- conversion, but doing this lazily should make this quite acceptable.
-
- With respect to a conventional interpreter, we have major advantages:
- + Full syntax checking: safety comparable to compiled code.
- + Semantics similar to compiled code due to code sharing. Similar diagnostic
- messages, etc. Reduction of error-prone code duplication.
- + Potential for full type checking according to declarations (would require
- running IR1 optimize?)
- + Simplifies debugger interface, since interpreted code can look more like
- compiled code: source paths, edit definition, etc.
-
- For all non-run-time symbol annotations (anything other than SYMBOL-FUNCTION
- and SYMBOL-VALUE), we use the compiler's global database. MACRO-FUNCTION will
- use INFO, rather than vice-versa.
-
- When doing the IR1 phases for the interpreter, we probably want to suppress
- optimizations that change user-visible function calls:
- -- Don't do local call conversion of any named functions (even lexical ones).
- This is so that a call will appear on the stack that looks like the call in
- the original source. The keyword and optional argument transformations
- done by local call mangle things quite a bit. Also, note that skipping
- local-call conversion prevents unreferenced arguments from being deleted,
- which is another non-obvious transformation.
- -- Don't run source-transforms, IR1 transforms and IR1 optimizers. This way,
- TRACE and BACKTRACE will show calls with the original arguments, rather
- than the "optimized" form, etc. Also, for the interpreter it will
- actually be faster to call the original function (which is compiled) than
- to "inline expand" it. Also, this allows implementation-dependent
- transforms to expand into %PRIMITIVE uses.
-
- There are some problems with stepping, due to our non-syntactic IR1
- representation. The source path information is the key that makes this
- conceivable. We can skip over the stepping of a subform by quietly evaluating
- nodes whose source path lies within the form being skipped.
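
The "lies within" test can be sketched as a prefix check. The representation here is an assumption made for illustration: a source path is taken to be a list of subform indices ordered from the outermost form inward, which need not match the compiler's actual encoding. PATH-WITHIN-P is a hypothetical name.

```lisp
;;; Assumption: a source path is a list of subform indices, outermost first.
(defun path-within-p (node-path skipped-path)
  "True if NODE-PATH names the form at SKIPPED-PATH or one of its subforms."
  (let ((pos (mismatch skipped-path node-path)))
    ;; MISMATCH returns NIL when the paths are identical, and the length of
    ;; SKIPPED-PATH when SKIPPED-PATH is a proper prefix of NODE-PATH.
    (or (null pos)
        (= pos (length skipped-path)))))
```

A stepper built this way would evaluate a node quietly whenever PATH-WITHIN-P holds for the form being skipped, and stop to ask the user otherwise.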
-
- One problem is determining what value has been returned by a form. With a
- function call, it is theoretically possible to precisely determine this, since
- if we complete evaluation of the arguments, then we arrive at the Combination
- node whose value is synonymous with the value of the form. We can even detect
- this case, since the Node-Source will be EQ to the form. And we can also
- detect when we unwind out of the evaluation, since we will leave the form
- without having ever reached this node.
-
- But with macros and special-forms, there is no node whose value is the value of
- the form, and no node whose source is the macro call or special form. We can
- still detect when we leave the form, but we can't be sure whether this was a
- normal evaluation result or an explicit RETURN-FROM.
-
- But does this really matter? It seems that we can print the value returned (if
- any), then just print the next form to step. In the rare case where we did
- unwind, the user should be able to figure it out.
-
- [We can look at this as a side-effect of CPS: there isn't any difference
- between a "normal" return and a non-local one.]
-
- [Note that in any control transfer (normal or otherwise), the stepper may need
- to unwind out of an arbitrary number of levels of stepping. This is because a
- form in a TR position may yield its value to a node arbitrarily far out.]
-
- Another problem is with deciding what form is being stepped. When we start
- evaluating a node, we dive into code that is nested somewhere down inside that
- form. So we actually have to do a loop of asking questions before we do any
- evaluation. But what do we ask about?
-
- If we ask about the outermost enclosing form that is a subform of the last
- form that the user said to execute, then we might offer a form that isn't
- really evaluated, such as a LET binding list.
-
- But once again, is this really a problem? It is certainly different from a
- conventional stepper, but a pretty good argument could be made that it is
- superior. Haven't you ever wanted to skip the evaluation of all the
- LET bindings, but not the body? Wouldn't it be useful to be able to skip the
- DO step forms?
-
- All of this assumes that nobody ever wants to step through the guts of a
- macroexpansion. This seems reasonable, since steppers are for weenies, and
- weenies don't define macros (hence don't debug them). But there are probably
- some weenies who don't know that they shouldn't be writing macros.
-
- We could handle this by finding the "source paths" in the expansion of each
- macro by sticking some special frob in the source path marking the place where
- the expansion happened. When we hit code again that is in the source, then we
- revert to the normal source path. Something along these lines might be a good
- idea anyway (for compiler error messages, for example).
-
- The source path hack isn't guaranteed to work quite so well in generated code,
- though, since macros return stuff that isn't freshly consed. But we could
- probably arrange to win as long as any given expansion doesn't return two EQ
- forms.
-
- It might be nice to have a command that skipped stepping of the form, but
- printed the results of each outermost enclosed evaluated subform, i.e. if you
- used this on the DO step-list, it would print the result of each new-value
- form. I think this is implementable. I guess what you would do is print each
- value delivered to a DEST whose source form is the current or an enclosing
- form. Along with the value, you would print the source form for the node that
- is computing the value.
-
- The stepper can also have a "back" command that "unskips" or "unsteps". This
- would allow the evaluation of forms that are pure (modulo lexical variable
- setting) to be undone. This is useful, since in stepping it is common that you
- skip a form that you shouldn't have, or get confused and want to restart at
- some earlier point.
-
- What we would do is remember the current node and the values of all local
- variables on the heap before doing each step or skip action. We can then back up
- the state of all lexical variables and the "program counter". To make this
- work right with set closure variables, we would copy the cell's value, rather
- than the value cell itself.
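
A rough sketch of the bookkeeping for "back" follows. It assumes, purely for illustration, that each variable lives in a value cell modeled as a cons whose CAR holds the value, and that a node can be any object. SNAPSHOT, *HISTORY*, RECORD-STEP and STEP-BACK are hypothetical names.

```lisp
;;; Hypothetical sketch: a value cell is modeled as a cons; its CAR is the value.
(defstruct snapshot
  node        ; the "program counter" at the time of the snapshot
  bindings)   ; list of (cell . saved-value) pairs

(defvar *history* '()
  "Snapshots taken before each step or skip, most recent first.")

(defun record-step (node cells)
  "Save NODE and the current contents of CELLS, an alist of (name . cell)."
  (push (make-snapshot
         :node node
         ;; Copy the value out of each cell and keep the cell itself, so a
         ;; later restore is visible to closures that share the cell.
         :bindings (mapcar (lambda (entry)
                             (cons (cdr entry) (car (cdr entry))))
                           cells))
        *history*))

(defun step-back ()
  "Undo the most recent step: restore variables and return the saved node."
  (let ((snap (pop *history*)))
    (when snap
      (loop for (cell . value) in (snapshot-bindings snap)
            do (setf (car cell) value))
      (snapshot-node snap))))
```

Because the restore writes into the original cell instead of replacing it, any closure that captured the variable sees the backed-up value as well.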
-
- [To be fair, note that this could easily be done with our current interpreter:
- the stepper could copy the environment alists.]
-
- We can't back up the "program counter" when a control transfer leaves the
- current function, since this state is implicitly represented in the
- interpreter's state, and is discarded when we exit. We probably want to ask
- for confirmation before leaving the function to give users a chance to "unskip"
- the forms in a TR position.
-
- Another question is whether the conventional stepper is really a good thing to
- imitate... How about an editor-based mouse-driven interface? Instead of
- "skipping" and "stepping", you would just designate the next form that you
- wanted to stop at. Instead of displaying return values, you replace the source
- text with the printed representation of the value.
-
- It would show the "program counter" by highlighting the *innermost* form that
- we are about to evaluate, i.e. the source form for the node that we are stopped
- at. It would probably also be useful to display the start of the form that was
- used to designate the next stopping point, although I guess this could be
- implied by the mouse position.
-
-
- Such an interface would be a little harder to implement than a dumb stepper,
- but it would be much easier to use. [It would be impossible for an evalhook
- stepper to do this.]
-
-
- %PRIMITIVE usage:
-
- Note: %PRIMITIVE can only be used in compiled code. It is a trapdoor into the
- compiler, not a general syntax for accessing "sub-primitives". Its main use
- is in implementation-dependent compiler transforms. It saves us the effort of
- defining a "phony function" (that is not really defined), and also allows
- direct communication with the code generator through codegen-info arguments.
-
- Some primitives may be exported from the VM so that %PRIMITIVE can be used to
- make it explicit that an escape routine or interpreter stub is assuming an
- operation is implemented by the compiler.
-